A basic limitation on inferring phylogenies by pairwise sequence comparisons.
Abstract
Distance-based approaches in phylogenetics, such as Neighbor-Joining, are a fast and popular way to build trees. These methods take pairs of sequences and construct from them a value that, in expectation, is additive under a stochastic model of site substitution. Most models assume a distribution of rates across sites, often based on a gamma distribution. Provided the (shape) parameter of this distribution is known, the method can correctly reconstruct the tree. However, if the shape parameter is not known, we show that topologically different trees, with different shape parameters and associated positive branch lengths, can lead to exactly matching distributions on pairwise site patterns between all pairs of taxa. Thus, one could not distinguish between the two trees using pairs of sequences without some prior knowledge of the shape parameter. More surprisingly, this can happen for any choice of distinct shape parameters on the two trees, and thus the result is not peculiar to a particular or contrived selection of the shape parameters. On a positive note, we point out known conditions where identifiability can be restored (namely, when the branch lengths are clocklike, or if methods such as maximum likelihood are used).
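To make the role of the shape parameter concrete, here is a minimal sketch of one standard distance correction that depends on it. It assumes the Jukes-Cantor substitution model with gamma-distributed rates across sites, which is only one of the models the abstract could refer to and is not taken from the paper itself. The function names `jc_distance` and `jc_gamma_distance` are hypothetical, chosen for illustration.

```python
import math


def jc_distance(p):
    """Jukes-Cantor distance with equal rates across sites.

    p is the observed proportion of differing sites between two sequences.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates (shape alpha, mean 1).

    Assumption: this is the standard gamma-corrected Jukes-Cantor formula,
    used here only as an illustration of a shape-dependent correction.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)


if __name__ == "__main__":
    p = 0.3  # same observed pairwise difference in every case
    for alpha in (0.5, 2.0, 10.0):
        print(f"alpha={alpha}: d={jc_gamma_distance(p, alpha):.4f}")
    print(f"equal rates: d={jc_distance(p):.4f}")
```

The same observed proportion of differences maps to a different corrected distance for each assumed shape parameter, which is why a distance method needs the shape parameter in advance in order to produce additive distances.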
Related articles
Mammalian phylogeny: comparison of morphological and molecular results.
In an attempt to resolve the "bushy" part at the root of the eutherian tree, 182 nondental morphological characters from 100 species (79 extant and 21 extinct; 98 mammalian and 2 nonmammalian) were analyzed using two maximum-parsimony tree-building algorithms. Parallel analyses of 2,258 pairwise immunodiffusion comparisons with chicken antisera on 101 mammalian species and of amino acid sequenc...
A weighted least-squares approach for inferring phylogenies from incomplete distance matrices.
Motivation: The problem of phylogenetic inference from data sets including incomplete or uncertain entries is among the most relevant issues in systematic biology. In this paper, we propose a new method for reconstructing phylogenetic trees from partial distance matrices. The new method combines the usage of the four-point condition and the ultrametric inequality with a weighted least-squares a...
An alternating least squares approach to inferring phylogenies from pairwise distances.
A computational method is presented for minimizing the weighted sum of squares of the differences between observed and expected pairwise distances between species, where the expectations are generated by an additive tree model. The criteria of Fitch and Margoliash (1967, Science 155:279-284) and Cavalli-Sforza and Edwards (1967, Evolution 21:550-570) are both weighted least squares, with differ...
Pairwise alignment with rearrangements.
The increase of available genomes poses new optimization problems in genome comparisons. A genome can be considered as a sequence of characters (loci) which are genes or segments of nucleotides. Genomes are subject to both nucleotide transformation and character order rearrangement processes. In this context, we define a problem of so-called pairwise alignment with rearrangements (PAR) between ...
Prospects for inferring very large phylogenies by using the neighbor-joining method.
Current efforts to reconstruct the tree of life and histories of multigene families demand the inference of phylogenies consisting of thousands of gene sequences. However, for such large data sets even a moderate exploration of the tree space needed to identify the optimal tree is virtually impossible. For these cases the neighbor-joining (NJ) method is frequently used because of its demonstrat...
Journal title:
- Journal of theoretical biology
Volume 256, Issue 3
Pages: -
Publication date: 2009